Introduction

New York City is among the most populous cities in the world, with about 8.6 million people (2017). It is also the melting pot of America, with representation from all races.

We came across many articles about racial segregation in NYC. https://www.nytimes.com/2019/07/16/nyregion/segregation-nyc-affordable-housing.html describes how the city’s policy of giving preference to local residents for new affordable housing helps perpetuate racial segregation. According to https://www.nytimes.com/2019/03/26/nyregion/school-segregation-new-york.html, NYC public schools are still struggling with racial segregation.

We were curious to look at demographic data to find out the racial distribution of the population in New York City and to help answer the following questions:

  1. Are races segregated across NYC?
  2. How are races segregated in New York City by boroughs and zip codes?
  3. Is segregation of races consistent with racial segregation in schools?

Datasets

We found that multiple datasets are available for demographic and race data. We chose to focus on the following datasets:

  1. US Government Census data, which includes data up to 2018 on race and ethnicity by county and zip code

https://data.census.gov/cedsci/profile?g=0400000US36&q=New%20York

Census data is collected every 10 years (the decennial census). The last census was conducted in 2010.

The alternative is the American Community Survey (ACS), which samples the population and is performed yearly. The datasets available from ACS are either yearly or an average of the past 5 years. The 5-year average is considered the most reliable. We chose to use the 2013-2017 5-year average data for our study.

The ACS data contains population numbers for the following races

  2. DOE New York City schools demographic data between 2013-2018

To analyze how segregated NYC schools are across zip codes, the datasets below were joined with the demographic dataset to obtain the zip code of each school.
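The join described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in the project. The field names (school_id, zip_code, total_enrollment) and all values are hypothetical and do not reflect the real DOE schema.

```python
# Hypothetical records: field names and values are illustrative only.
demographics = [
    {"school_id": "A001", "year": 2017, "total_enrollment": 500},
    {"school_id": "B002", "year": 2017, "total_enrollment": 320},
    {"school_id": "C003", "year": 2017, "total_enrollment": 410},
]
locations = [
    {"school_id": "A001", "zip_code": "10001"},
    {"school_id": "B002", "zip_code": "11201"},
]


def attach_zip_codes(demographics, locations):
    """Left join: keep every demographic row and add zip_code when it is known."""
    zip_by_school = {row["school_id"]: row["zip_code"] for row in locations}
    return [
        {**row, "zip_code": zip_by_school.get(row["school_id"])}
        for row in demographics
    ]


joined = attach_zip_codes(demographics, locations)

# Schools without a matching location keep zip_code = None so they can be reviewed.
unmatched = [row["school_id"] for row in joined if row["zip_code"] is None]
```

Keeping unmatched schools in the result, rather than silently dropping them, makes it easier to see how many schools would be excluded from any zip-code-level analysis.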

Analysis

Distribution of Races by Borough

We used the choroplethr packages to map population by demographic group onto choropleth maps of NYC boroughs and zip codes.

Choropleth maps allow us to compare data across geographical areas spatially. The borough maps were useful to get a sense of where each race was concentrated at the borough level.

Observations from Borough Choropleth maps:

  • The population of White people is higher in Brooklyn than in any other borough
  • The population of Black people is higher in Brooklyn than in any other borough
  • The population of Asian people is higher in Queens than in any other borough
  • The population of Hispanic people is higher in the Bronx than in any other borough

Distribution of Races by Zip Code

Choroplethr maps by zip code allowed us to identify clusters of zip codes where each race was concentrated.

Observations from Choropleth maps By Zipcodes:

  • A high concentration of the White population can be seen in neighborhoods in lower Brooklyn, central Staten Island and the Upper West Side
  • A high concentration of the Black population can be seen in neighborhoods in southern Brooklyn and southern Queens
  • A high concentration of the Hispanic population can be seen in neighborhoods in the Bronx

Studying Racial concentration by Zipcodes using Bar Charts

We noticed that zip codes in NYC follow a distinct prefix pattern for each borough:

  • 100** for Manhattan (with the exception of 10280)
  • 103** for Staten Island
  • 104** for Bronx
  • 112** for Brooklyn
  • 11*** for Queens (except 112**)

We can get a sense of racial concentration by borough by plotting the populations per race against the sorted list of Zip Codes.

Observations from Bar Plots Per Zipcode for Individual Races:

  • High concentration of whites in Brooklyn
  • High concentration of blacks in Brooklyn, yet sparser than that of whites

Comparing Races

Next we compared the populations of the races across all the zip codes. The aim was to identify whether any one race had a larger concentration relative to the others across the zip codes in NYC.

Observations from Stacked Bar Plots Per Zipcode:

  • There are zip codes with a predominantly White population (10128, 11211, 11385)
  • There are zip codes with a predominantly Black population (10466, 11236)
  • There are zip codes with a predominantly Hispanic population (10468)
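One way to flag zip codes like those listed above is to find the largest group in each zip code and check its share of the total. The sketch below is illustrative only: the report does not define "predominantly", so the 50% threshold is an assumption, and the zip labels and counts are made up.

```python
# Hypothetical population counts per zip code; labels and numbers are made up.
populations = {
    "Z1": {"White": 800, "Black": 50, "Asian": 100, "Hispanic": 50},
    "Z2": {"White": 300, "Black": 300, "Asian": 200, "Hispanic": 200},
    "Z3": {"White": 100, "Black": 700, "Asian": 50, "Hispanic": 150},
}


def predominant_race(counts, threshold=0.5):
    """Return the largest group if its share exceeds threshold, else None."""
    total = sum(counts.values())
    if total == 0:
        return None
    race, count = max(counts.items(), key=lambda item: item[1])
    # threshold = 0.5 is an assumed cut-off, not a definition from the report.
    return race if count / total > threshold else None


predominant = {z: predominant_race(c) for z, c in populations.items()}
```

A stricter threshold would flag fewer zip codes, so the cut-off should be stated alongside any list produced this way.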

Comparing Racial Population by Neighborhood

Next we studied each neighborhood in NYC, using the neighborhoods listed at https://www.health.ny.gov/statistics/cancer/registry/appendix/neighborhoods.htm

The goal was to find neighborhoods where populations from one race dominated.

Bronx

Observations about Neighborhoods in Bronx:

  • Neighborhoods in the Bronx were found to have a high concentration of Hispanic population compared to other races. Central Bronx, Bronx Park and Fordham, Hunts Point and Mott Haven had the highest concentration of Hispanic population.

Brooklyn

Observations about Neighborhoods in Brooklyn:

  • Central Brooklyn, Canarsie and Flatlands, Flatbush, East New York and New Lots are predominantly Black
  • Southwest Brooklyn, Borough Park, Southern Brooklyn, Northwest Brooklyn are predominantly White

Manhattan

Observations about Neighborhoods in Manhattan:

  • Central Harlem is predominantly Black
  • Chelsea and Clinton, Gramercy Park and Murray Hill, Greenwich Village and Soho, Lower Manhattan and the Upper West Side are predominantly White

Queens

Observations about Neighborhoods in Queens:

  • West Central and Northwest Queens have a much higher number of whites than other races
  • Jamaica and Southeast Queens have a much higher number of blacks than other races

Staten Island

Observations about Neighborhoods in Staten Island:

  • Staten Island in general has a higher number of whites than other races
  • The South Shore of Staten Island has a very high concentration of whites compared to other races

NYC school segregation analysis

Analysis to find out whether the segregation of races is consistent with racial segregation in schools

This simple bar chart shows the total number of students enrolled in each borough.

Brooklyn has the highest number of students enrolled over the 5 years from 2013 to 2018, and Staten Island has the lowest of the 5 boroughs.

Percentage of students for each race

We extracted the few variables required to look at the percentage of enrolled students in NYC schools and plotted its distribution for each race.

The bar chart shows a higher concentration of Hispanic and Black students, a significant observation that affects some of the observations made later in the analysis.

Race segregation in NYC schools by boroughs

In the stacked bar chart above, we observe the following patterns in the ethnic composition of NYC schools in each borough.

  • There seems to be a higher proportion of Hispanic students in the Bronx compared to other boroughs.
  • We also see a higher proportion of Black students in Brooklyn. Since Brooklyn has the highest number of enrolled students, we can infer that Brooklyn has the highest number of Black students among all the boroughs.
  • The proportion of White students in Staten Island is significantly higher than in other boroughs, and the proportion of Asian students is highest in Queens among all the boroughs.
  • Queens seems to be the least segregated borough in New York City and the Bronx seems to be the most segregated of all the boroughs.

Percent of schools where each race is the largest race per borough

For each school we find the most common race among the enrolled students. In the chart below we compute the percentage of schools in each borough with a particular race as the most common one. With this, we are trying to see whether the students of a particular race are concentrated in a few regions of the borough or distributed more evenly.

  • In a vast majority of schools in the Bronx, Hispanic students make up a higher proportion than any other race. This implies that Hispanic students are distributed throughout the Bronx.
  • In the Bronx, there are very few (if any) schools with a high concentration of White and Asian students.
  • Brooklyn seems to have a uniform distribution of Black students, and we also see that a few schools in Brooklyn have a high concentration of Asian and White students.
  • White students seem to be evenly distributed in Staten Island, and Hispanic students seem to be concentrated in a few schools.
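The computation behind this chart (most common race per school, then the percentage of schools per borough for each top race) can be sketched as below. This is an illustration under assumed data structures, not the project's code. The borough labels, field names and enrollment numbers are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical schools: borough labels and enrollment counts are made up.
schools = [
    {"borough": "X", "enrollment": {"Asian": 10, "Black": 30, "Hispanic": 50, "White": 10}},
    {"borough": "X", "enrollment": {"Asian": 5, "Black": 20, "Hispanic": 70, "White": 5}},
    {"borough": "X", "enrollment": {"Asian": 5, "Black": 60, "Hispanic": 30, "White": 5}},
    {"borough": "Y", "enrollment": {"Asian": 20, "Black": 10, "Hispanic": 10, "White": 60}},
    {"borough": "Y", "enrollment": {"Asian": 55, "Black": 10, "Hispanic": 15, "White": 20}},
]


def top_race(enrollment):
    """Most common race in one school (ties resolve to the first key listed)."""
    return max(enrollment, key=enrollment.get)


def percent_schools_by_top_race(schools):
    """For each borough, the percentage of schools where each race is the largest."""
    counts = defaultdict(Counter)
    for school in schools:
        counts[school["borough"]][top_race(school["enrollment"])] += 1
    return {
        borough: {race: 100 * n / sum(c.values()) for race, n in c.items()}
        for borough, c in counts.items()
    }


percentages = percent_schools_by_top_race(schools)
```

Note that this measure counts schools, not students, so a borough can show a race as the top race in most schools even when school sizes differ widely.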

Mean Segregation score per borough and (by top race) over time

In the next two plots we show how the segregation score changed over time, during the 5 years of data used in the project.

For this we defined a metric called seggregation_score to capture the level of segregation of each school in New York City. This allows us to compare schools with different distributions of students’ ethnicities.

seggregation_score = prop_common - mean(prop_others)

where prop_common is the proportion of the most common race in the school and prop_others is the proportions of the other races.

This can be simplified when there are five race categories and the proportions are percentages summing to 100, since then mean(prop_others) = (100 - prop_common) / 4:

seggregation_score = (1.25 * prop_common) - 25

We know that this only considers the most common ethnicity in each school and hence does not differentiate schools based on the proportions of the other ethnicities.
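The metric and its simplified form can be checked against each other with a short sketch. The code assumes five race categories with proportions given as percentages summing to 100, which is the setting in which the 1.25 and 25 constants hold. The category names and numbers below are illustrative, not taken from the dataset.

```python
def seggregation_score(proportions):
    """prop_common - mean(prop_others), with proportions given as percentages."""
    values = sorted(proportions.values(), reverse=True)
    prop_common, prop_others = values[0], values[1:]
    return prop_common - sum(prop_others) / len(prop_others)


def simplified_score(prop_common):
    """Simplified form; valid only for five categories summing to 100."""
    return 1.25 * prop_common - 25


# Hypothetical school: category names and percentages are illustrative.
school = {"Asian": 10, "Black": 20, "Hispanic": 60, "White": 5, "Other": 5}

full = seggregation_score(school)
short = simplified_score(max(school.values()))
```

Under these assumptions the score is 0 for a school where all five groups are equal (20% each) and 100 for a school with a single group, which gives a feel for the scale of the plots.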

  • Brooklyn and the Bronx have higher segregation scores than the other boroughs.
  • The segregation score has been decreasing over the 5 years for Brooklyn and Staten Island, and it has not changed much for Queens and Manhattan.
  • The number of Black students enrolled in schools seems to be the highest compared to students of other ethnicities, but over the 5 years of data, the segregation score for Black students has decreased.
  • The numbers of White and Asian students enrolled seem to be the lowest among all groups, and their segregation scores have not changed much over the 5 years of data used.
  • The segregation score for Hispanic students has also not changed much during these 5 years.
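The per-borough trend lines summarized above come down to averaging school-level scores within each borough and year. A minimal sketch, assuming each record is a (borough, year, score) tuple, follows. The borough labels and scores are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical school-level records: (borough, year, segregation score).
records = [
    ("A", 2013, 40.0),
    ("A", 2013, 60.0),
    ("A", 2014, 45.0),
    ("B", 2013, 30.0),
]


def mean_score_by_borough_year(records):
    """Average the school-level scores within each (borough, year) group."""
    groups = defaultdict(list)
    for borough, year, score in records:
        groups[(borough, year)].append(score)
    return {key: mean(scores) for key, scores in sorted(groups.items())}


trend = mean_score_by_borough_year(records)
```

Grouping by the school's top race instead of its borough would give the second of the two plots in the same way.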

Race segregation analysis in NYC schools by zip code

The Choropleth map below shows the racial segregation of NYC schools by zip code for the 5 years of data.

Zip codes with the highest segregation scores seem to be highly clustered in a few regions on the above map.

Most segregated schools in NYC

  • The top 10 most segregated zip codes fall into 4 tight clusters, as shown above. This could mean that there are certain areas where segregation is high, which could be a result of underlying population segregation.
  • In the second map, we see that the top 10 least segregated zip codes are more widely distributed.
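Selecting the most and least segregated zip codes is a simple ranking once a mean score per zip code is available. The sketch below assumes such a mapping already exists. The zip labels and scores are placeholders, and n is set to 2 only to keep the example small (the maps above use 10).

```python
import heapq

# Hypothetical mean segregation score per zip code; labels and values are made up.
mean_scores = {"Z1": 80.0, "Z2": 10.0, "Z3": 55.0, "Z4": 30.0}


def rank_zip_codes(mean_scores, n=10):
    """Return (most segregated, least segregated) zip codes, n of each."""
    most = heapq.nlargest(n, mean_scores.items(), key=lambda item: item[1])
    least = heapq.nsmallest(n, mean_scores.items(), key=lambda item: item[1])
    return [z for z, _ in most], [z for z, _ in least]


most_segregated, least_segregated = rank_zip_codes(mean_scores, n=2)
```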

Most common race by zip code in a choropleth

All ethnicities seem to be highly clustered in the map above, which could be due to the underlying population being clustered along racial lines. It is very surprising to see such contiguous clusters with the same race being the most common one.

Analysis on students enrolled in NYC schools by zip code

The following two plots show the consistency between NYC school enrollment and the underlying population, which was plotted in the first section of this project under Distribution of Races by Zip Code. The second plot shows school enrollment for 2017 specifically: the highly populated areas have higher numbers of students enrolled in schools.

Relation between Poverty percentage and segregation score in NYC schools

Poverty is worth considering when talking about school diversity.

We observe that Black and Hispanic students are much more likely to attend a school where more than 75% of students experience poverty.
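A statement of this kind can be quantified as the share of each group's students who attend schools above the 75% poverty mark. The sketch below is one possible way to compute it, not the project's code. The field names (poverty_pct, enrollment) and all numbers are hypothetical.

```python
from collections import defaultdict

# Hypothetical schools: poverty percentages and enrollment counts are made up.
schools = [
    {"poverty_pct": 90.0, "enrollment": {"Black": 100, "Hispanic": 150, "White": 10}},
    {"poverty_pct": 40.0, "enrollment": {"Black": 20, "Hispanic": 50, "White": 190}},
]


def share_in_high_poverty(schools, threshold=75.0):
    """Fraction of each group's students enrolled in schools above the threshold."""
    total = defaultdict(int)
    high = defaultdict(int)
    for school in schools:
        for race, count in school["enrollment"].items():
            total[race] += count
            if school["poverty_pct"] > threshold:
                high[race] += count
    return {race: high[race] / total[race] for race in total if total[race] > 0}


shares = share_in_high_poverty(schools)
```

Weighting by student counts rather than counting schools keeps a few very small schools from dominating the comparison.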

Relation between Economic need index and segregation score

The following plot shows a similar pattern to the one above: we observe a higher economic need index for Black and Hispanic students in NYC over the 5 years of data.

Conclusion:

References:

https://dlab.berkeley.edu/sites/default/files/training_materials/Census_Lecture_030915.pdf
https://rdrr.io/cran/choroplethr/man/county_choropleth_acs.html
https://arilamstein.com/creating-zip-code-choropleths-choroplethrzip/
https://www.trulia.com/blog/tech/the-choroplethr-package-for-r/